# Compute the Dopaminergic Similarities of Participants

https://github.com/nmslib/hnswlib

## Install required libraries

```sh
make
```

## Run the import

```sh
python dopamine_participant_similarities.py -u <neo4j username> -p <password> -b <bolt uri> -s <source directory>
```

## Remove DopamineASD and DopamineDD labels

Eight participants labeled with DopamineASD and one labeled with DopamineDD had a dopamineGeneticDosage vector whose size differed from the total number of genes marked with the label DopamineGene (n=251).

To compute the similarities, the dopamineGeneticDosage vector must have the same size for every participant; otherwise the script produces an error.
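The size requirement can be checked before running the script. The following is a minimal sketch, not part of the repository: it assumes the vectors have already been loaded into a plain dictionary keyed by participant id, and `find_mismatched`, the participant ids, and the toy values are all hypothetical names introduced for illustration.

```python
from typing import Dict, Sequence

# Number of DopamineGene nodes reported in this README.
EXPECTED_SIZE = 251


def find_mismatched(
    vectors: Dict[str, Sequence[float]],
    expected_size: int = EXPECTED_SIZE,
) -> Dict[str, int]:
    """Return {participant_id: actual_length} for vectors of the wrong size."""
    return {
        pid: len(vec)
        for pid, vec in vectors.items()
        if len(vec) != expected_size
    }


# Toy data: ids and values are made up, only the lengths matter here.
vectors = {
    "participant_a": [0.0] * 251,
    "participant_b": [0.0] * 253,  # too long, would break the similarity step
}

mismatched = find_mismatched(vectors)
print(mismatched)
```

Any participant reported by such a check would need to be fixed or excluded before the similarities are computed.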

Thus, these 9 participants need to have their label removed in the Neo4j database before running the similarities script:

```cypher
// Two separate statements: a MATCH cannot directly follow REMOVE in one query.
MATCH (n:DopamineASD)
WHERE size(n.dopamineGeneDosageVector) <> 251
REMOVE n:DopamineASD;

MATCH (n:DopamineDD)
WHERE size(n.dopamineGeneDosageVector) <> 251
REMOVE n:DopamineDD;
```

These participants had both a deletion and a duplication on the same gene or genes, which caused the vector size to be greater than 251.